OWL-CM: OWL Combining Matcher based on Belief Functions Theory

Authors

Boutheina Ben Yaghlane

Najoua Laamari

Abstract

In this paper we propose a new tool called OWL-CM (OWL Combining Matcher) that deals with the uncertainty inherent in the ontology mapping process. On the one hand, OWL-CM uses similarity metrics to assess the equivalence between ontology entities; on the other hand, it incorporates belief functions theory into the mapping process in order to improve the effectiveness of the results computed by different matchers and to provide a generic framework for combining them. Our experiments, carried out with the benchmark of the Ontology Alignment Evaluation Initiative 2007, demonstrate good results.

1 Presentation of the system

1.1 State, purpose, general statement

Semantic heterogeneity has been identified as one of the most important issues in information integration [5]. This research problem is due to semantic mismatches between models. Ontologies, which provide a vocabulary for representing knowledge about a domain, are frequently subject to integration. Ontology mapping is a fundamental operation for resolving semantic heterogeneity: it determines mappings between ontologies, and these mappings capture semantic equivalence between them. Experts try to establish mappings manually. However, manual reconciliation of semantics tends to be tedious, time consuming, error prone and expensive, and therefore inefficient in dynamic environments. Moreover, the introduction of the Semantic Web vision has underscored the need to make the ontology mapping process automatic. Recently, a number of studies on automatic ontology mapping have drawn attention to the difficulty of making the operation fully automatic, because of human cognitive complexity. Thus, since (semi-)automatic ontology mapping carries a degree of uncertainty, there is no guarantee that the mapping output by existing ontology mapping techniques is the exact one.
In this context, we propose a new tool called OWL-CM (OWL Combining Matcher) with the aim of showing how handling uncertainty in the ontology mapping process can improve the effectiveness of the output.

1.2 Specific techniques used

On the one hand, OWL-CM uses the Dempster-Shafer theory of evidence [11] to deal with the uncertainty inherent in the mapping process, especially when interpreting and combining the results returned by different matchers. On the other hand, it uses similarity measures to assess the correspondence between ontology entities.

For the OWL-CM tool we have proposed an architecture (see Figure 1) that contains four components. The transformer takes as input two ontologies (O1 and O2) and constructs a database for each one (DB1 and DB2). The database schema follows a standard schema that we designed based on some axioms of the RDF(S) and OWL languages. The filters decide on result mappings, whereas the simple matchers and complex matchers assess the equivalence between entities.

Fig. 1. OWL-CM Architecture.

The corresponding algorithm that we have implemented follows four steps (see Figure 2). The first step, called pre-mapping, is mainly devoted to converting each of the input ontologies O1 and O2 into a database (DB1 and DB2). The following three steps sequentially perform the iteration over concept mappings, then the iteration over object property mappings, and finally the iteration over datatype property mappings. Each iteration is based on methods belonging to four categories of tasks, namely initialization, screening, handling uncertainty, and ending. The algorithm requires as input the two ontologies to be mapped and two databases that have to be declared as ODBC data sources. It outputs three lists of result mappings, which are produced sequentially: each one is returned at the close of the corresponding mapping iteration.
The total result is returned in the form of a file.

Fig. 2. OWL-CM Algorithm.

1.2.1 Preliminary concepts

The following list gives some of the preliminaries used by our approach.

1. Candidate mapping: We define a candidate mapping as a pair of entities (e_i, e_j) that is not yet in the map.

2. Result mapping: We define a result mapping as a pair of entities that have been related: 〈e_i, ≡, e_j〉 denotes that entity e_i is equivalent to entity e_j, whereas 〈e_i, ⊥, e_j〉 denotes that the two entities are not equivalent.

3. Similarity measure: The similarity measure, sim, is a function defined in [3], based on the vocabularies ε1 of the ontology O1 and ε2 of the ontology O2, as follows:

   sim: ε × ε × O × O → [0, 1]
   sim(a, b) = 1 ⇔ a = b: two objects are assumed to be identical.
   sim(a, b) = 0 ⇔ a ≠ b: two objects are assumed to be different and have no common characteristics.
   sim(a, a) = 1: similarity is reflexive.
   sim(a, b) = sim(b, a): similarity is symmetric.
   Similarity and distance are inverse to each other.

A similarity measure function assesses the semantic correspondence between two entities based on some features. Table 1 lists the similarity measures employed, depending on the type of entities to be mapped. Furthermore, we distinguish between two types of similarity: syntactic similarity, assessed by the measures that evaluate the distance between strings (e.g., String similarity and String equality), and semantic similarity, assessed by the other measures (e.g., String synonymy, Explicit equality and Set similarity).

4. SEE (Semantic Equivalent Entity): Depending on the type of entities, we formally define the semantic equivalence between two entities as follows:

Definition (SEE).
An entity e_j is semantically equivalent to an entity e_i such that (e_i, e_j) ∈ {C1 × C2}, i.e., 〈e_i, ≡, e_j〉, if at least one of the following conditions is true:

   sim_expeql(e_i, e_j) = 1, or
   ∀ sim_k with k ≠ expeql, sim_k(e_i, e_j) = 1

An entity e_j is semantically equivalent to an entity e_i such that (e_i, e_j) ∈ {Rc × Rc ∪ Rd × Rd}, i.e., 〈e_i, ≡, e_j〉, if:

   ∀ sim_k, sim_k(e_i, e_j) = 1

Table 1. Features and Measures for Similarity

Entities to be compared   No.   Feature (f)                        Similarity measure
Concepts: C               1     (label, C1)                        sim_strsim(C1, C2)
                          2     (sound (ID), C1)                   sim_streql(C1, C2)
                          3     (label, C1)                        sim_strsyn(C1, C2)
                          4     (C1, equalTo, C2) relation         sim_expeql(C1, C2)
                          5     (C1, inequalTo, C2) relation       sim_expineq(C1, C2)
                          6     all (direct-sub-concepts, S1)      sim_setsim(S1, S2)
Relations: Rc             7     (sound (ID), R1)                   sim_streql(R1, R2)
                          8     (domain, R1) ∧ (range, R1)         sim_objeql(R1, R2)
                          9     (domain, R1) ∧ (range, R1)         sim_objineq(R1, R2)
                          10    all (direct-sub-properties, S1)    sim_setsim(S1, S2)
Relations: Rd             11    (sound (ID), R1)                   sim_streql(R1, R2)
                          12    (domain, R1) ∧ (range, R1)         sim_objeql(R1, R2) ∧ sim_streql(R1, R2)
                          13    (domain, R1)                       sim_objineq(R1, R2)
                          14    all (direct-sub-properties, S1)    sim_setsim(S1, S2)

5. USEE (Uncertain Semantic Equivalent Entity): We extend the definition of SEE to USEE so that it can be used throughout the process of handling uncertainty when performing and combining matchers.

Definition (USEE). An entity said to be uncertain and semantically equivalent to an ontological entity e ∈ O1 is a pair (Θ, m), where:

   Θ = E, E ∈ {C2, Rc, Rd}
   m is a belief mass function (see Section 1.2.2).

1.2.2 Handling uncertainty

The Dempster-Shafer theory of evidence [11] presents some advantages that encourage us to choose it over other theories. In particular, it can be used for problems where the existing information is very fragmented, so that the information cannot be modelled with a probabilistic formalism without making arbitrary hypotheses.
It is also considered a flexible modelling tool that makes it possible to handle different forms of uncertainty, mainly ignorance. Moreover, this theory provides a method for combining the effect of different beliefs to establish a new global belief by using Dempster's rule of combination.

The belief mass function m(.) is the basic concept of this theory ([11], [12]). It assigns a belief mass in the interval [0, 1] to each element of the power set 2^Θ of the frame of discernment Θ. The total mass distributed is 1, and the closed world hypothesis (i.e., m(∅) = 0) is generally assumed. In our work, Θ ∈ {C2, Rc, Rd}. The letter Φ in Table 2 denotes the set of all candidate mappings.

Table 2. Frame of Discernment and Candidate Mappings Set.

           e^2_1             ...    e^2_m               ⇒ Θ
e^1_1      (e^1_1, e^2_1)    ...    (e^1_1, e^2_m)
...        ...               ...    ...                 } Φ
e^1_n      (e^1_n, e^2_1)    ...    (e^1_n, e^2_m)

In order to discover USEEs, we use n functions called matchers (matcher_k). A matcher can be compared to a "witness" that brings evidence for or against an advanced hypothesis. Matchers produce USEEs in order to account for uncertainty. Some matchers are more reliable than others. This is reflected in the confidence that is assigned to each matcher. The confidence is expressed through the mass that is assigned to Θ. For instance, if matcher_1 has a confidence of 0.6, then the masses assigned to the subsets should be normalized to sum to 0.6, and 0.4 should always be assigned to Θ. We use Dempster's rule of combination to aggregate the produced USEEs. Figure 3 illustrates the architecture that we propose to discover USEEs.

In addition, this theory makes it possible to express total ignorance. For instance, if the set that contains the entities having the same sound as the entity in question is empty, then the matcher matcher_2 will return the belief mass function m(Θ) = 1.
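The confidence weighting and the combination step described in Section 1.2.2 can be illustrated with a minimal Python sketch. It is not the OWL-CM implementation: the function names (matcher_mass, combine), the dictionary representation of a mass function, and the example entities and scores are all assumptions made for illustration. The combination follows the standard form of Dempster's rule, m12(C) = Σ_{A∩B=C} m1(A)·m2(B) / (1 − K), where K = Σ_{A∩B=∅} m1(A)·m2(B) is the conflicting mass.

```python
from itertools import product


def matcher_mass(scores, confidence, theta):
    """Turn raw matcher scores into a belief mass function (hypothetical helper).

    scores maps subsets of theta (frozensets) to non-negative raw scores.
    The masses on those subsets are normalized to sum to `confidence`;
    the remaining 1 - confidence is assigned to theta. With no evidence
    at all, the result is total ignorance: m(theta) = 1.
    """
    total = sum(scores.values())
    if total <= 0:
        return {theta: 1.0}
    masses = {s: confidence * v / total for s, v in scores.items() if v > 0}
    masses[theta] = masses.get(theta, 0.0) + (1.0 - confidence)
    return masses


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a & b
        if c:
            combined[c] = combined.get(c, 0.0) + x * y
        else:
            conflict += x * y  # mass that would fall on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so that m(empty set) = 0 (closed world hypothesis).
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Illustrative frame of discernment: candidate entities of O2 (made-up names).
THETA = frozenset({"Person", "Human", "Car"})

# A matcher with confidence 0.6 that favours "Person" over "Human".
m1 = matcher_mass({frozenset({"Person"}): 2.0, frozenset({"Human"}): 1.0}, 0.6, THETA)
# A matcher that found no evidence: total ignorance.
m2 = matcher_mass({}, 0.8, THETA)
# A matcher with confidence 0.5 that supports only "Person".
m3 = matcher_mass({frozenset({"Person"}): 1.0}, 0.5, THETA)

m12 = combine(m1, m2)
m13 = combine(m1, m3)

for subset, mass in sorted(m13.items(), key=lambda kv: -kv[1]):
    print(sorted(subset), round(mass, 4))
```

In this sketch, combining m1 with the totally ignorant m2 leaves m1 unchanged, while combining m1 with m3 raises the mass on {Person} from 0.4 to about 0.67, because both matchers support that candidate.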
1.3 Adaptations made for the evaluation

Our mapping algorithm has been conceived only recently, so our tool OWL-CM is in an alpha version and we are evaluating it for the first time.

Footnote 3: The index k is the number (No.) of the matcher in Table 1.

Fig. 3. Architecture for discovering USEEs.

Related resources

An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)

Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...

Characterizing Services Composeability and OWL-S Based Services Composition

Grid has emerged as a new paradigm for integration within dynamic virtual enterprises. Given a service-oriented Grid environment, more complex, value-added sophisticated services and applications can be built via service composition. In this paper, we discuss Grid services composition by leveraging Semantic Web services standards and technology, especially OWL-S. The characterization of service...

Combining SWRL rules and OWL ontologies with Protégé OWL Plugin, Jess, and Racer

The presentation concerns a draft implementation with Protégé OWL Plugin for SWRL, based on the RDF concrete syntax of the SWRL proposal. A first prototype of a SWRL Tab Widget has been achieved. It is a bridge between Protégé OWL, Racer, and Jess, intended to help reasoning with SWRL rules combined with OWL ontologies. A small example is given including an OWL ontology representing the family ...

Dione: An OWL representation of ICD-10-CM for classifying patients’ diseases

BACKGROUND Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) has been designed as standard clinical terminology for annotating Electronic Health Records (EHRs). EHRs textual information is used to classify patients' diseases into an International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) category (usually by an expert). Improving the accuracy...

On Applying the AGM Theory to DLs and OWL

It is generally acknowledged that any Knowledge Base (KB) should be able to adapt itself to new information received. This problem has been extensively studied in the field of belief change, the dominating approach being the AGM theory. This theory set the standard for determining the rationality of a given belief change mechanism but was placed in a certain context which makes it inapplicable ...



Publication date: 2007
